Conversation
📝 Walkthrough

Changes
Estimated Code Review Effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 2
🧹 Nitpick comments (2)
dbt/models/marts/marketing/marketing_excellent_rating_revenue.sql (1)
34-38: `sum(product_total_revenue) over ()` is evaluated twice — extract to a third CTE.

In the current 2-CTE pattern, the window function is recomputed for `cohort_total_revenue` and again inside `pct_of_cohort_revenue`. Adding a `with_totals` CTE lets `final` reference `cohort_total_revenue` directly, eliminating the redundancy.

♻️ Proposed 3-CTE refactor
```diff
 ),

+with_totals as (
+    select
+        *,
+        sum(product_total_revenue) over () as cohort_total_revenue
+    from excellent_products
+),
+
 final as (
     select
-        product_id,
-        product_name,
-        product_category,
-        unit_price,
-        avg_rating,
-        rating_tier,
-        review_count,
-        total_units_sold,
-        product_total_revenue,
-        product_gross_profit,
-        margin_pct,
-        sum(product_total_revenue) over () as cohort_total_revenue,
+        *,
         round(
-            product_total_revenue / nullif(sum(product_total_revenue) over (), 0) * 100,
+            product_total_revenue / nullif(cohort_total_revenue, 0) * 100,
             2
         ) as pct_of_cohort_revenue
-    from excellent_products
+    from with_totals
 )
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@dbt/models/marts/marketing/marketing_excellent_rating_revenue.sql` around lines 34 - 38, The window sum sum(product_total_revenue) over () is computed twice; add a third CTE (e.g., with_totals) that selects all columns from the existing CTE (e.g., cohort or whatever upstream CTE produces product_total_revenue), plus cohort_total_revenue := sum(product_total_revenue) over (), then update the final SELECT (in the final CTE or main query) to reference cohort_total_revenue directly and compute pct_of_cohort_revenue as round(product_total_revenue / nullif(cohort_total_revenue, 0) * 100, 2) so the window function is calculated once and reused.

dbt/models/marts/core/schema.yml (1)
90-141: Consider adding `not_null` tests to key new columns in `dim_products`.

None of the 15 new columns carry any dbt tests. The downstream `marketing_excellent_rating_revenue` model filters on `rating_tier` and aggregates `product_total_revenue` — NULL values in either would silently skew the cohort (NULLs are excluded from `SUM`) or produce NULL `pct_of_cohort_revenue` rows. At minimum, `product_name`, `avg_rating`, `rating_tier`, and `product_total_revenue` warrant `not_null` tests.

💡 Example additions
```diff
       - name: product_name
         description: Human-readable name of the product
         data_type: varchar
+        tests:
+          - not_null
 ...
       - name: avg_rating
         description: Average customer review rating on a 1–5 scale, derived from int_product_reviews_agg
         data_type: numeric
+        tests:
+          - not_null
 ...
       - name: rating_tier
         description: >
           ...
         data_type: varchar
+        tests:
+          - not_null
 ...
       - name: product_total_revenue
         description: Total gross revenue generated by the product (units_sold × unit_price)
         data_type: numeric
+        tests:
+          - not_null
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dbt/models/marts/core/schema.yml` around lines 90 - 141, The schema for model dim_products is missing not_null tests on key new columns which can produce silent NULL-skewed analytics; add dbt not_null tests under the dim_products model for product_name, avg_rating, rating_tier, and product_total_revenue (and consider total_units_sold/product_gross_profit if applicable) by adding a tests: [not_null] entry for each column in dbt/models/marts/core/schema.yml so dbt will validate these fields at run time.
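The NULL-exclusion behaviour called out above is easy to confirm outside the warehouse; a minimal sketch using Python's built-in sqlite3, with a table name borrowed from the model and invented revenue values:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table excellent_products (product_total_revenue real)")
# One row with a NULL revenue, as could slip through without a not_null test.
con.executemany(
    "insert into excellent_products values (?)",
    [(600.0,), (300.0,), (None,)],
)

# SUM silently skips the NULL row: the cohort total comes back as 900.0
# rather than NULL, quietly skewing every row's pct_of_cohort_revenue.
total = con.execute(
    "select sum(product_total_revenue) from excellent_products"
).fetchone()[0]
print(total)  # → 900.0
```

The NULL row itself would also yield a NULL `pct_of_cohort_revenue`, since dividing NULL by the cohort total propagates the NULL.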
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dbt/models/marts/marketing/schema.yml`:
- Around line 280-282: Update the margin_pct column description to match the
canonical wording from dim_products: replace the current text ("Gross margin
percentage — product_gross_profit / product_total_revenue × 100") with the
dim_products phrasing ("unit_margin / unit_price × 100") so the schema entry for
margin_pct aligns exactly with the authoritative definition used in
dim_products; locate the margin_pct field in this file and update its
description string accordingly.
- Around line 226-230: The business_logic test "Percentage of cohort revenue
sums to 100" is too strict (diff > 0.01) given pct_of_cohort_revenue is rounded
to 2 decimals and per-row rounding error accumulates; update the test SQL to use
a wider tolerance (e.g., change the WHERE clause threshold from 0.01 to 0.25) or
alternatively change the model that produces pct_of_cohort_revenue to round(...,
4) and keep a tighter threshold; modify the SQL in the test (the sql field
referencing {{ model }} and pct_of_cohort_revenue) to implement one of these
fixes.
---
Nitpick comments:
In `@dbt/models/marts/core/schema.yml`:
- Around line 90-141: The schema for model dim_products is missing not_null
tests on key new columns which can produce silent NULL-skewed analytics; add dbt
not_null tests under the dim_products model for product_name, avg_rating,
rating_tier, and product_total_revenue (and consider
total_units_sold/product_gross_profit if applicable) by adding a tests:
[not_null] entry for each column in dbt/models/marts/core/schema.yml so dbt will
validate these fields at run time.
In `@dbt/models/marts/marketing/marketing_excellent_rating_revenue.sql`:
- Around line 34-38: The window sum sum(product_total_revenue) over () is
computed twice; add a third CTE (e.g., with_totals) that selects all columns
from the existing CTE (e.g., cohort or whatever upstream CTE produces
product_total_revenue), plus cohort_total_revenue := sum(product_total_revenue)
over (), then update the final SELECT (in the final CTE or main query) to
reference cohort_total_revenue directly and compute pct_of_cohort_revenue as
round(product_total_revenue / nullif(cohort_total_revenue, 0) * 100, 2) so the
window function is calculated once and reused.
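The 3-CTE refactor described in the prompt above can be smoke-tested without dbt; a sketch using Python's sqlite3 (requires SQLite 3.25+ for window functions), with `excellent_products` reduced to the two relevant columns and made-up revenues:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "create table excellent_products (product_id int, product_total_revenue real)"
)
con.executemany(
    "insert into excellent_products values (?, ?)",
    [(1, 600.0), (2, 300.0), (3, 100.0)],
)

# 3-CTE shape: with_totals evaluates the window sum once, and final
# reuses cohort_total_revenue instead of recomputing sum(...) over ().
rows = con.execute("""
    with with_totals as (
        select
            *,
            sum(product_total_revenue) over () as cohort_total_revenue
        from excellent_products
    ),
    final as (
        select
            product_id,
            round(
                product_total_revenue / nullif(cohort_total_revenue, 0) * 100,
                2
            ) as pct_of_cohort_revenue
        from with_totals
    )
    select * from final order by product_id
""").fetchall()
print(rows)  # → [(1, 60.0), (2, 30.0), (3, 10.0)]
```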
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
dbt/models/marts/core/schema.yml
dbt/models/marts/marketing/marketing_excellent_rating_revenue.sql
dbt/models/marts/marketing/schema.yml
```yaml
      - name: "Percentage of cohort revenue sums to 100"
        type: "business_logic"
        sql: "SELECT * FROM (SELECT ABS(SUM(pct_of_cohort_revenue) - 100) AS diff FROM {{ model }}) t WHERE diff > 0.01"
        expected_result: "0 rows"
        description: "The sum of pct_of_cohort_revenue across all Excellent products must equal 100 (within 0.01 tolerance)"
```
The 0.01 tolerance for the sum-to-100 check will produce false failures for cohorts with more than a few products.

`pct_of_cohort_revenue` is `round(..., 2)`, so each row carries up to ±0.005 rounding error. With N products the cumulative deviation from 100 can reach N × 0.005. A cohort of 7 equal-share products already yields a diff of 0.03, which exceeds the 0.01 threshold and triggers the test.

A simple fix is to widen the tolerance to accommodate realistic cohort sizes. If you expect up to ~50 products, a threshold of 0.25 (50 × 0.005) would be safe, or round to more decimal places in the SQL model (e.g., `round(..., 4)`) and tighten accordingly.
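The accumulation argument is easy to reproduce; a sketch assuming an equal-share cohort for simplicity (real cohorts drift less predictably, but the same bound applies):

```python
def rounded_pct_sum_diff(n_products: int) -> float:
    """Absolute drift of the summed percentages from 100 when n
    equal-revenue products each report round(100 / n, 2)."""
    share = round(100.0 / n_products, 2)  # per-row pct_of_cohort_revenue
    return abs(share * n_products - 100.0)

for n in (3, 7, 13, 49):
    print(n, rounded_pct_sum_diff(n))
# n = 7 already drifts by about 0.03, past the 0.01 threshold;
# larger cohorts with awkward shares can drift further.
```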
💡 Example tolerance adjustment

```diff
-        sql: "SELECT * FROM (SELECT ABS(SUM(pct_of_cohort_revenue) - 100) AS diff FROM {{ model }}) t WHERE diff > 0.01"
+        sql: "SELECT * FROM (SELECT ABS(SUM(pct_of_cohort_revenue) - 100) AS diff FROM {{ model }}) t WHERE diff > 0.25"
-        description: "The sum of pct_of_cohort_revenue across all Excellent products must equal 100 (within 0.01 tolerance)"
+        description: "The sum of pct_of_cohort_revenue across all Excellent products must equal 100 (within 0.25 tolerance, accounting for per-row rounding to 2 decimal places)"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dbt/models/marts/marketing/schema.yml` around lines 226 - 230, The
business_logic test "Percentage of cohort revenue sums to 100" is too strict
(diff > 0.01) given pct_of_cohort_revenue is rounded to 2 decimals and per-row
rounding error accumulates; update the test SQL to use a wider tolerance (e.g.,
change the WHERE clause threshold from 0.01 to 0.25) or alternatively change the
model that produces pct_of_cohort_revenue to round(..., 4) and keep a tighter
threshold; modify the SQL in the test (the sql field referencing {{ model }} and
pct_of_cohort_revenue) to implement one of these fixes.
```yaml
      - name: margin_pct
        description: Gross margin percentage — product_gross_profit / product_total_revenue × 100
        data_type: numeric
```
`margin_pct` description diverges from the authoritative `dim_products` definition.

The core schema defines `margin_pct` as "unit_margin / unit_price × 100" (Line 106-107 of core/schema.yml), while this file describes it as "product_gross_profit / product_total_revenue × 100". The two are mathematically equivalent for a constant unit price, but the inconsistent wording across files will confuse downstream consumers trying to verify the formula against the source model.

Align the description here with the `dim_products` wording to keep the column's meaning unambiguous.
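The equivalence claim is straightforward to verify; a sketch with invented figures, assuming the relationships the two schemas imply (product_total_revenue = units_sold × unit_price and product_gross_profit = units_sold × unit_margin):

```python
unit_price, unit_margin, units_sold = 25.0, 10.0, 37  # hypothetical values

product_total_revenue = units_sold * unit_price   # 925.0
product_gross_profit = units_sold * unit_margin   # 370.0

# The units_sold factor cancels, so both wordings describe the same number.
margin_pct_core = unit_margin / unit_price * 100                           # dim_products wording
margin_pct_marketing = product_gross_profit / product_total_revenue * 100  # marketing schema wording

print(margin_pct_core, margin_pct_marketing)  # → 40.0 40.0
```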
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dbt/models/marts/marketing/schema.yml` around lines 280 - 282, Update the
margin_pct column description to match the canonical wording from dim_products:
replace the current text ("Gross margin percentage — product_gross_profit /
product_total_revenue × 100") with the dim_products phrasing ("unit_margin /
unit_price × 100") so the schema entry for margin_pct aligns exactly with the
authoritative definition used in dim_products; locate the margin_pct field in
this file and update its description string accordingly.
test
Made with Cursor